{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Toxicity Model Configuration\n",
    "\n",
    "[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/whylabs/LanguageToolkit/blob/main/langkit/examples/Toxicity_Model_Configuration.ipynb)\n",
    "\n",
    "In this example, we'll show you how you can use different toxicity models to extract the `toxicity` metric from a prompt/response with LangKit."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Install LangKit\n",
    "\n",
    "First let's install __LangKit__."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install langkit[all]==0.0.30"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By default, Langkit uses the [`martin-ha/toxic-comment-model`](https://huggingface.co/martin-ha/toxic-comment-model) model. That's the model that will be used if you simply call `toxicity.init()`.\n",
    "\n",
    "We can also call it explictly:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'prompt': 'I hate you!', 'prompt.toxicity': 0.9164737462997437}"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from langkit import toxicity, extract\n",
    "# this is the default model\n",
    "toxicity.init(model_path=\"martin-ha/toxic-comment-model\")\n",
    "results = extract({\"prompt\": \"I hate you!\"})\n",
    "\n",
    "results"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`toxic-comment-model` seems to be very sensitive with regards to punctuation. For example, try removing the exclamation mark above and see how the toxicity score changes."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Langkit also supports toxicity models from Unitary's [detoxify](https://github.com/unitaryai/detoxify) package. For example, to use detoxify's `unbiased` model, you can call `toxicity.init(model_path=\"detoxify/unbiased\")`, like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Results - detoxify/unbiased:\n",
      " {'prompt': 'I hate you.', 'prompt.toxicity': 0.81225103}\n"
     ]
    }
   ],
   "source": [
    "toxicity.init(model_path=\"detoxify/unbiased\")\n",
    "\n",
    "results = extract({\"prompt\": \"I hate you.\"})\n",
    "print(f\"Results - detoxify/unbiased:\\n {results}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "On premilinary internal tests, the `unbiased` model seems to perform better than the default model at a cost of a slower inference time. If you want to sacrifice latency for a boost in accuracy, you might want to experiment with the `unbiased` model.\n",
    "\n",
    "In addition to `unbiased`, you can also use the `original` and `multilingual` models from detoxify:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Results - detoxify/multilingual:\n",
      " {'prompt': 'Eu te odeio.', 'prompt.toxicity': 0.9851344}\n",
      "Results - detoxify/original:\n",
      " {'prompt': 'I hate you.', 'prompt.toxicity': 0.9475088}\n"
     ]
    }
   ],
   "source": [
    "toxicity.init(model_path=\"detoxify/multilingual\")\n",
    "\n",
    "results = extract({\"prompt\": \"Eu te odeio.\"})\n",
    "print(f\"Results - detoxify/multilingual:\\n {results}\")\n",
    "\n",
    "toxicity.init(model_path=\"detoxify/original\")\n",
    "\n",
    "results = extract({\"prompt\": \"I hate you.\"})\n",
    "print(f\"Results - detoxify/original:\\n {results}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `multilingual` model has been trained on the following languages: english, french, spanish, italian, portuguese, turkish or russian.\n",
    "\n",
    "You can have more information on the different models directly at the [detoxify](https://github.com/unitaryai/detoxify) repository."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Limitations and Ethical Considerations\n",
    "\n",
    "For more information on the limitations and ethical considerations for each of the related modules, please refer to:\n",
    "\n",
    "- https://huggingface.co/martin-ha/toxic-comment-model#limitations-and-bias\n",
    "- https://github.com/unitaryai/detoxify?tab=readme-ov-file#limitations-and-ethical-considerations"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "langkit-rNdo63Yk-py3.8",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
